🐿️ Scour
📄 Text Chunking

Semantic Segmentation, Context Windows, Document Boundaries, Retrieval Units

Why Your Chunking Strategy Makes or Breaks Your AI System
medium.com · 4d · Discuss: Hacker News
📄 Semantic Chunking
StoryGem: Voronoi treemap Approach for Semantics-Preserving Text Visualization
arxiv.org · 1d
🔶 Voronoi Diagrams
davidchisnall/igk: I got Knuth'd: A compiler for documents
github.com · 7h
Concrete Syntax
Which Vision Language Models Should You Use for Your Apps
thenewstack.io · 1d
🤖 Advanced OCR
Why Your Next LLM Might Not Have A Tokenizer
towardsdatascience.com · 18h
🤖 Grammar Induction
Clustering News Articles for Topic Detection: A Technical Deep Dive
dev.to · 3d · Discuss: DEV
📚 Document Clustering
June 25, 2025 Flight Tracking Workshop (4 hour) [Americas / Europe-friendly time]
bellingcat.com · 14h
🧮 Prolog Parsing
New: Improve Apache Iceberg query performance in Amazon S3 with sort and z-order compaction
aws.amazon.com · 17h
🔄 Burrows-Wheeler
ByteSpan: Information-Driven Subword Tokenisation
arxiv.org · 1d
💾 Binary Linguistics
Markov-Enhanced Clustering for Long Document Summarization: Tackling the 'Lost in the Middle' Challenge with Large Language Models
arxiv.org · 1d
📥 Feed Aggregation
Detecting Machine-Generated Texts: Not Just "AI vs Humans" and Explainability is Complicated
arxiv.org · 10h
🧮 Kolmogorov Complexity
Practical tips to optimize documentation for LLMs, AI agents, and chatbots
biel.ai · 19h · Discuss: Hacker News
🤖 Archive Automation
What LLMs Know About Their Users
schneier.com · 3h
💻 Local LLMs
Portable Network Graphics (PNG) Specification (Third Edition)
w3.org · 17h · Discuss: Hacker News
🕸️ WebP Analysis
The modern text processing pipeline: Overview
newroadoldway.com · 1d · Discuss: Lobsters, r/programming
🔤 Unicode Normalization
Automattic/harper: Offline, privacy-first grammar checker. Fast, open-source, Rust-powered
github.com · 1d
Concrete Syntax
PDF Retrieval Augmented Question Answering
arxiv.org · 1d
📊 Multi-vector RAG
How to sync Context across AI Assistants (ChatGPT, Claude, Perplexity...) in your browser
dev.to · 21h · Discuss: DEV
🖥️ Modern Terminals
BPCLIP: A Bottom-up Image Quality Assessment from Distortion to Semantics Based on CLIP
arxiv.org · 1d
🖼️ JPEG XL
Semantic-Aware Parsing for Security Logs
arxiv.org · 1d
Log Parsing